On the Parameterized Complexity of Consensus Clustering
نویسندگان
چکیده
Given a collection C of partitions of a base set S, the NP-hard Consensus Clustering problem asks for a partition of S which has a total Mirkin distance of at most t to the partitions in C, where t is a nonnegative integer. We present a parameterized algorithm for Consensus Clustering with running time O(4.24 ·k+ |C| · |S|), where k := t/|C| is the average Mirkin distance of the solution partition to the partitions of C. Furthermore, we strengthen previous hardness results for Consensus Clustering, showing that Consensus Clustering remains NP-hard even when all input partitions contain at most two subsets. Finally, we study a local search variant of Consensus Clustering, showing W[1]-hardness for the parameter “radius of the Mirkin-distance neighborhood”. In the process, we also consider a local search variant of the related Cluster Editing problem, showing W[1]-hardness for the parameter “radius of the edge modification neighborhood”.
منابع مشابه
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
متن کاملAverage Parameterization and Partial Kernelization for Computing Medians
We propose an effective polynomial-time preprocessing strategy for intractable median problems. Developing a new methodological framework, we show that if the input instances of generally intractable problems exhibit a sufficiently high degree of similarity between each other on average, then there are efficient exact solving algorithms. In other words, we show that the median problems Swap Med...
متن کاملA new ensemble clustering method based on fuzzy cmeans clustering while maintaining diversity in ensemble
An ensemble clustering has been considered as one of the research approaches in data mining, pattern recognition, machine learning and artificial intelligence over the last decade. In clustering, the combination first produces several bases clustering, and then, for their aggregation, a function is used to create a final cluster that is as similar as possible to all the cluster bundles. The inp...
متن کاملON FUZZY NEIGHBORHOOD BASED CLUSTERING ALGORITHM WITH LOW COMPLEXITY
The main purpose of this paper is to achieve improvement in thespeed of Fuzzy Joint Points (FJP) algorithm. Since FJP approach is a basisfor fuzzy neighborhood based clustering algorithms such as Noise-Robust FJP(NRFJP) and Fuzzy Neighborhood DBSCAN (FN-DBSCAN), improving FJPalgorithm would an important achievement in terms of these FJP-based meth-ods. Although FJP has many advantages such as r...
متن کاملThe ensemble clustering with maximize diversity using evolutionary optimization algorithms
Data clustering is one of the main steps in data mining, which is responsible for exploring hidden patterns in non-tagged data. Due to the complexity of the problem and the weakness of the basic clustering methods, most studies today are guided by clustering ensemble methods. Diversity in primary results is one of the most important factors that can affect the quality of the final results. Also...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011